Online Hoeffding Bound Algorithm for Segmenting Time Series Stream Data

Authors

  • Dima ALBERG
  • Avner BEN-YAIR
Abstract

In this paper we introduce the ISW (Interval Sliding Window) algorithm, which applies to numerical time series data streams and takes as input a combined Hoeffding bound confidence level parameter rather than a maximum error threshold. The proposed algorithm has two advantages: first, it allows performance comparisons across different time series data streams without changing the algorithm settings; second, it does not require preprocessing the original time series data stream to determine a reasonable error value heuristically. The algorithm was implemented in two modes: offline and online. Finally, an empirical evaluation was performed on two types of time series data: stationary (normally distributed data) and non-stationary (financial data).
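To illustrate why a confidence level can stand in for a hand-tuned error threshold, the sketch below computes the standard one-sided Hoeffding bound. It is not the ISW algorithm itself, whose details are not given in the abstract; the function name `hoeffding_epsilon` and its parameters are assumptions made for this example only.

```python
import math


def hoeffding_epsilon(value_range: float, n: int, delta: float) -> float:
    """Return the one-sided Hoeffding deviation bound (illustrative helper).

    For n independent observations whose values lie in an interval of
    width value_range, the sample mean exceeds the true mean by more
    than the returned epsilon with probability at most delta.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if value_range < 0.0:
        raise ValueError("value_range must be non-negative")
    # epsilon = R * sqrt(ln(1 / delta) / (2 * n))
    return value_range * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


# The same delta yields a threshold that scales with each stream's own
# value range and shrinks as more points are observed.
eps_small_window = hoeffding_epsilon(value_range=1.0, n=100, delta=0.05)
eps_large_window = hoeffding_epsilon(value_range=1.0, n=400, delta=0.05)
```

Because the bound depends only on the value range, the number of observations, and the confidence parameter, the same `delta` can be reused across streams with different scales, which is consistent with the first advantage claimed above.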

Similar resources

Segmenting Big Data Time Series Stream Data

Big data time series data streams are ubiquitous in finance, meteorology and engineering. It may be impossible to process an entire “big data” continuous data stream or to scan through it multiple times due to its tremendous volume. In Heraclitus’s well-known saying, “you never step in the same stream twice,” and so it is with “big data” temporal data streams. Unlike traditional data sets, big ...

Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features

Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...

PRESEE: An MDL/MML Algorithm to Time-Series Stream Segmenting

Time-series stream is one of the most common data types in data mining field. It is prevalent in fields such as stock market, ecology, and medical care. Segmentation is a key step to accelerate the processing speed of time-series stream mining. Previous algorithms for segmenting mainly focused on the issue of ameliorating precision instead of paying much attention to the efficiency. Moreover, t...

Algorithms for Segmenting Time Series

As with most computer science problems, representation of the data is the key to efficient and effective solutions. Piecewise linear representation has been used for the representation of the data. This representation has been used by various researchers to support clustering, classification, indexing and association rule mining of time series data. A variety of algorithms have been proposed to obtain...

A MPAA-Based Iterative Clustering Algorithm Augmented by Nearest Neighbors Search for Time-Series Data Streams

In streaming time series the Clustering problem is more complex, since the dynamic nature of streaming data makes previous clustering methods inappropriate. In this paper, we propose firstly a new method to evaluate Clustering in streaming time series databases. First, we introduce a novel multiresolution PAA (MPAA) transform to achieve our iterative clustering algorithm. The method is based on...

Publication date: 2011